Overview

Dataset statistics

Number of variables: 25
Number of observations: 10053
Missing cells: 12454
Missing cells (%): 5.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.4 MiB
Average record size in memory: 151.0 B
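The missing-cell percentage follows from the raw counts in the table above. A minimal sketch of the arithmetic, with the counts copied from the report (the variable names are illustrative, not part of the report):

```python
# Counts copied from the "Dataset statistics" table above.
n_variables = 25
n_observations = 10053
missing_cells = 12454

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

# Share of all cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

Most of the missing cells come from a single column: facades_number alone accounts for 9994 of the 12454.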

Variable types

NUM: 10
CAT: 8
BOOL: 7

Warnings

property_subtype_median_facades is highly correlated with building_property_subtype_median_facades [High correlation]
building_property_subtype_median_facades is highly correlated with property_subtype_median_facades [High correlation]
building_state_median_price is highly correlated with building_state_agg [High correlation]
building_state_agg is highly correlated with building_state_median_price [High correlation]
building_property_subtype_median_facades is highly correlated with property_subtype and 1 other field [High correlation]
property_subtype is highly correlated with building_property_subtype_median_facades and 1 other field [High correlation]
property_subtype_median_facades is highly correlated with property_subtype and 1 other field [High correlation]
facades_number has 9994 (99.4%) missing values [Missing]
building_property_subtype_median_facades has 1230 (12.2%) missing values [Missing]
property_subtype_median_facades has 1230 (12.2%) missing values [Missing]
garden_area is highly skewed (γ1 = 28.71980321) [Skewed]
Unnamed: 0 has unique values [Unique]
rooms_number has 249 (2.5%) zeros [Zeros]
terrace_area has 5499 (54.7%) zeros [Zeros]
garden_area has 7911 (78.7%) zeros [Zeros]
land_surface has 5435 (54.1%) zeros [Zeros]

Reproduction

Analysis started: 2020-11-20 11:13:25.865550
Analysis finished: 2020-11-20 11:14:02.622503
Duration: 36.76 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

Unnamed: 0
Real number (ℝ≥0)

UNIQUE

Distinct10053
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5897.859942
Minimum0
Maximum11287
Zeros1
Zeros (%)< 0.1%
Memory size78.5 KiB

Quantile statistics

Minimum0
5-th percentile678.6
Q13138
median6068
Q38679
95-th percentile10761.4
Maximum11287
Range11287
Interquartile range (IQR)5541

Descriptive statistics

Standard deviation3219.503486
Coefficient of variation (CV)0.5458765582
Kurtosis-1.172085138
Mean5897.859942
Median Absolute Deviation (MAD)2760
Skewness-0.09608453574
Sum59291186
Variance10365202.7
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
20471< 0.1%
 
33551< 0.1%
 
74651< 0.1%
 
54161< 0.1%
 
95101< 0.1%
 
33631< 0.1%
 
74571< 0.1%
 
54081< 0.1%
 
95021< 0.1%
 
13061< 0.1%
 
Other values (10043)1004399.9%
 
ValueCountFrequency (%) 
01< 0.1%
 
11< 0.1%
 
21< 0.1%
 
31< 0.1%
 
41< 0.1%
 
ValueCountFrequency (%) 
112871< 0.1%
 
112861< 0.1%
 
112841< 0.1%
 
112831< 0.1%
 
112821< 0.1%
 

source
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.5 KiB
ValueCountFrequency (%) 
6999499.4%
 
4590.6%
 
Frequencies of value counts

Unique

Unique0
Unique (%)0.0%
Histogram of lengths of the category

Length

Max length1
Median length1
Mean length1
Min length1

postcode
Real number (ℝ≥0)

Distinct812
Distinct (%)8.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4954.659206
Minimum1000
Maximum9992
Zeros0
Zeros (%)0.0%
Memory size78.5 KiB

Quantile statistics

Minimum1000
5-th percentile1040
Q11490
median4500
Q38370
95-th percentile9420
Maximum9992
Range8992
Interquartile range (IQR)6880

Descriptive statistics

Standard deviation3187.617962
Coefficient of variation (CV)0.6433576618
Kurtosis-1.624508924
Mean4954.659206
Median Absolute Deviation (MAD)3300
Skewness0.09993177295
Sum49809189
Variance10160908.27
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
83003453.4%
 
10002983.0%
 
90002912.9%
 
11802862.8%
 
10502162.1%
 
84001711.7%
 
40001401.4%
 
12001221.2%
 
10701191.2%
 
14201181.2%
 
Other values (802)794779.1%
 
ValueCountFrequency (%) 
10002983.0%
 
1020460.5%
 
10301121.1%
 
1040670.7%
 
10502162.1%
 
ValueCountFrequency (%) 
99921< 0.1%
 
99911< 0.1%
 
999090.1%
 
99883< 0.1%
 
99811< 0.1%
 

house_is
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.8 KiB
ValueCountFrequency (%) 
False502950.0%
 
True502450.0%
 

property_subtype
Categorical

HIGH CORRELATION

Distinct23
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size78.5 KiB
ValueCountFrequency (%) 
APARTMENT372737.1%
 
HOUSE309930.8%
 
VILLA5675.6%
 
MIXED_USE_BUILDING5475.4%
 
APARTMENT_BLOCK4544.5%
 
DUPLEX3673.7%
 
PENTHOUSE2973.0%
 
GROUND_FLOOR2662.6%
 
FLAT_STUDIO2022.0%
 
MANSION1131.1%
 
Other values (13)4144.1%
 
Frequencies of value counts

Unique

Unique1
Unique (%)< 0.1%
Histogram of lengths of the category

Length

Max length20
Median length9
Mean length8.368049339
Min length3

price
Real number (ℝ≥0)

Distinct1207
Distinct (%)12.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean371368.8246
Minimum25000
Maximum1396000
Zeros0
Zeros (%)0.0%
Memory size78.5 KiB

Quantile statistics

Minimum25000
5-th percentile125000
Q1215000
median299000
Q3450000
95-th percentile860000
Maximum1396000
Range1371000
Interquartile range (IQR)235000

Descriptive statistics

Standard deviation236721.1506
Coefficient of variation (CV)0.6374287094
Kurtosis3.488524833
Mean371368.8246
Median Absolute Deviation (MAD)104000
Skewness1.742836437
Sum3733370794
Variance5.603690315e+10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
2950001401.4%
 
2990001351.3%
 
2250001331.3%
 
3950001261.3%
 
1990001261.3%
 
2490001201.2%
 
2750001181.2%
 
3250001031.0%
 
250000981.0%
 
245000981.0%
 
Other values (1197)885688.1%
 
ValueCountFrequency (%) 
250001< 0.1%
 
300002< 0.1%
 
390002< 0.1%
 
400001< 0.1%
 
450001< 0.1%
 
ValueCountFrequency (%) 
13960001< 0.1%
 
1395000170.2%
 
139000070.1%
 
13850004< 0.1%
 
13800002< 0.1%
 

rooms_number
Real number (ℝ≥0)

ZEROS

Distinct7
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.764547896
Minimum0
Maximum6
Zeros249
Zeros (%)2.5%
Memory size78.5 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median3
Q33
95-th percentile5
Maximum6
Range6
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.247029294
Coefficient of variation (CV)0.4510789254
Kurtosis0.1485538856
Mean2.764547896
Median Absolute Deviation (MAD)1
Skewness0.3751681032
Sum27792
Variance1.55508206
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
ValueCountFrequency (%) 
2317631.6%
 
3317131.5%
 
4149714.9%
 
1103710.3%
 
56366.3%
 
62872.9%
 
02492.5%
 
ValueCountFrequency (%) 
02492.5%
 
1103710.3%
 
2317631.6%
 
3317131.5%
 
4149714.9%
 
ValueCountFrequency (%) 
62872.9%
 
56366.3%
 
4149714.9%
 
3317131.5%
 
2317631.6%
 

area
Real number (ℝ≥0)

Distinct427
Distinct (%)4.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean155.398488
Minimum5
Maximum470
Zeros0
Zeros (%)0.0%
Memory size78.5 KiB

Quantile statistics

Minimum5
5-th percentile56
Q193
median135
Q3200
95-th percentile330
Maximum470
Range465
Interquartile range (IQR)107

Descriptive statistics

Standard deviation84.26146565
Coefficient of variation (CV)0.5422283494
Kurtosis1.151725891
Mean155.398488
Median Absolute Deviation (MAD)48
Skewness1.150419671
Sum1562221
Variance7099.994594
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
1202162.1%
 
1002082.1%
 
902032.0%
 
1502022.0%
 
2001891.9%
 
1101701.7%
 
801701.7%
 
1601681.7%
 
1301561.6%
 
1401541.5%
 
Other values (417)821781.7%
 
ValueCountFrequency (%) 
52< 0.1%
 
152< 0.1%
 
164< 0.1%
 
174< 0.1%
 
182< 0.1%
 
ValueCountFrequency (%) 
470100.1%
 
4681< 0.1%
 
4671< 0.1%
 
4651< 0.1%
 
4621< 0.1%
 
equipped_kitchen_has
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.8 KiB
ValueCountFrequency (%) 
True848084.4%
 
False157315.6%
 

furnished
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.8 KiB
ValueCountFrequency (%) 
False968796.4%
 
True3663.6%
 

open_fire
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.8 KiB
ValueCountFrequency (%) 
False948694.4%
 
True5675.6%
 

terrace
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.8 KiB
ValueCountFrequency (%) 
True671366.8%
 
False334033.2%
 

terrace_area
Real number (ℝ≥0)

ZEROS

Distinct134
Distinct (%)1.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean11.10156172
Minimum0
Maximum708
Zeros5499
Zeros (%)54.7%
Memory size78.5 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q315
95-th percentile50
Maximum708
Range708
Interquartile range (IQR)15

Descriptive statistics

Standard deviation22.84805366
Coefficient of variation (CV)2.058093648
Kurtosis129.7892583
Mean11.10156172
Median Absolute Deviation (MAD)0
Skewness7.26852841
Sum111604
Variance522.0335561
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
0549954.7%
 
202742.7%
 
102522.5%
 
152242.2%
 
82062.0%
 
301972.0%
 
61871.9%
 
121871.9%
 
251731.7%
 
41541.5%
 
Other values (124)270026.9%
 
ValueCountFrequency (%) 
0549954.7%
 
1270.3%
 
2830.8%
 
31271.3%
 
41541.5%
 
ValueCountFrequency (%) 
7081< 0.1%
 
4501< 0.1%
 
4001< 0.1%
 
3501< 0.1%
 
3301< 0.1%
 

garden
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.8 KiB
ValueCountFrequency (%) 
False791178.7%
 
True214221.3%
 

garden_area
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct586
Distinct (%)5.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean124.184721
Minimum0
Maximum40000
Zeros7911
Zeros (%)78.7%
Memory size78.5 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile613.2
Maximum40000
Range40000
Interquartile range (IQR)0

Descriptive statistics

Standard deviation860.1838449
Coefficient of variation (CV)6.926647966
Kurtosis1125.943137
Mean124.184721
Median Absolute Deviation (MAD)0
Skewness28.71980321
Sum1248429
Variance739916.2471
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
0791178.7%
 
100840.8%
 
50570.6%
 
300540.5%
 
200530.5%
 
400470.5%
 
500440.4%
 
150420.4%
 
30400.4%
 
1380.4%
 
Other values (576)168316.7%
 
ValueCountFrequency (%) 
0791178.7%
 
1380.4%
 
21< 0.1%
 
31< 0.1%
 
51< 0.1%
 
ValueCountFrequency (%) 
400002< 0.1%
 
294001< 0.1%
 
178001< 0.1%
 
156012< 0.1%
 
150001< 0.1%
 

land_surface
Real number (ℝ≥0)

ZEROS

Distinct1430
Distinct (%)14.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean292.9476773
Minimum0
Maximum4170
Zeros5435
Zeros (%)54.1%
Memory size78.5 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q3312
95-th percentile1500
Maximum4170
Range4170
Interquartile range (IQR)312

Descriptive statistics

Standard deviation576.0962548
Coefficient of variation (CV)1.96655
Kurtosis10.9323011
Mean292.9476773
Median Absolute Deviation (MAD)0
Skewness3.058475153
Sum2945003
Variance331886.8948
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
0543554.1%
 
150500.5%
 
100490.5%
 
120440.4%
 
110400.4%
 
200400.4%
 
70380.4%
 
90350.3%
 
160350.3%
 
300340.3%
 
Other values (1420)425342.3%
 
ValueCountFrequency (%) 
0543554.1%
 
180.1%
 
41< 0.1%
 
51< 0.1%
 
61< 0.1%
 
ValueCountFrequency (%) 
41701< 0.1%
 
41261< 0.1%
 
41001< 0.1%
 
40651< 0.1%
 
40003< 0.1%
 

facades_number
Categorical

MISSING

Distinct4
Distinct (%)6.8%
Missing9994
Missing (%)99.4%
Memory size78.5 KiB
ValueCountFrequency (%) 
2270.3%
 
3180.2%
 
4120.1%
 
12< 0.1%
 
(Missing)999499.4%
 
Frequencies of value counts

Unique

Unique0
Unique (%)0.0%
Histogram of lengths of the category

Length

Max length3
Median length3
Mean length3
Min length3

swimming_pool_has
Boolean

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size9.8 KiB
ValueCountFrequency (%) 
False987198.2%
 
True1821.8%
 

region
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.5 KiB
ValueCountFrequency (%) 
F508950.6%
 
W310230.9%
 
B186218.5%
 
Frequencies of value counts

Unique

Unique0
Unique (%)0.0%
Histogram of lengths of the category

Length

Max length1
Median length1
Mean length1
Min length1

building_state_agg
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.5 KiB
ValueCountFrequency (%) 
good758175.4%
 
to_renovate174517.4%
 
renovated7277.2%
 
Frequencies of value counts

Unique

Unique0
Unique (%)0.0%
Histogram of lengths of the category

Length

Max length11
Median length4
Mean length5.576643788
Min length4

postcode_median_price
Real number (ℝ≥0)

Distinct402
Distinct (%)4.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean335273.9037
Minimum65000
Maximum1350000
Zeros0
Zeros (%)0.0%
Memory size78.5 KiB

Quantile statistics

Minimum65000
5-th percentile167500
Q1237500
median298400
Q3399000
95-th percentile620000
Maximum1350000
Range1285000
Interquartile range (IQR)161500

Descriptive statistics

Standard deviation136471.0566
Coefficient of variation (CV)0.4070434804
Kurtosis0.6017389488
Mean335273.9037
Median Absolute Deviation (MAD)73400
Skewness0.9867318184
Sum3370508554
Variance1.86243493e+10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
6490003453.4%
 
3990003123.1%
 
3790002932.9%
 
5600002862.8%
 
2250002712.7%
 
2350002182.2%
 
6010002162.1%
 
2690001861.9%
 
3250001811.8%
 
2950001641.6%
 
Other values (392)758175.4%
 
ValueCountFrequency (%) 
650001< 0.1%
 
700001< 0.1%
 
795001< 0.1%
 
920001< 0.1%
 
980001< 0.1%
 
ValueCountFrequency (%) 
13500001< 0.1%
 
11500001< 0.1%
 
9250005< 0.1%
 
8450001< 0.1%
 
7990001< 0.1%
 

building_state_median_price
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size78.5 KiB
ValueCountFrequency (%) 
320000758175.4%
 
230000174517.4%
 
3100007277.2%
 
Frequencies of value counts

Unique

Unique0
Unique (%)0.0%
Histogram of lengths of the category

Length

Max length8
Median length8
Mean length8
Min length8

property_subtype_median_price
Real number (ℝ≥0)

Distinct22
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean316466.8756
Minimum118000
Maximum652500
Zeros0
Zeros (%)0.0%
Memory size78.5 KiB

Quantile statistics

Minimum118000
5-th percentile282500
Q1282500
median288000
Q3310000
95-th percentile540000
Maximum652500
Range534500
Interquartile range (IQR)27500

Descriptive statistics

Standard deviation80490.29644
Coefficient of variation (CV)0.2543403517
Kurtosis3.923005571
Mean316466.8756
Median Absolute Deviation (MAD)5500
Skewness1.960431562
Sum3181441500
Variance6478687820
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=22)
ValueCountFrequency (%) 
282500372737.1%
 
288000309930.8%
 
5400005675.6%
 
3100005475.4%
 
3575004544.5%
 
3250003673.7%
 
4950002973.0%
 
3150002662.6%
 
1490002022.0%
 
5250001131.1%
 
Other values (12)4144.1%
 
ValueCountFrequency (%) 
1180005< 0.1%
 
1220005< 0.1%
 
1490002022.0%
 
238000750.7%
 
282500372737.1%
 
ValueCountFrequency (%) 
652500620.6%
 
5400005675.6%
 
5350001< 0.1%
 
5250001131.1%
 
4950002973.0%
 

building_property_subtype_median_facades
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct3
Distinct (%)< 0.1%
Missing1230
Missing (%)12.2%
Memory size78.5 KiB
ValueCountFrequency (%) 
2458945.6%
 
3360535.9%
 
46296.3%
 
(Missing)123012.2%
 
Frequencies of value counts

Unique

Unique0
Unique (%)0.0%
Histogram of lengths of the category

Length

Max length3
Median length3
Mean length3
Min length3

property_subtype_median_facades
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct3
Distinct (%)< 0.1%
Missing1230
Missing (%)12.2%
Memory size78.5 KiB
ValueCountFrequency (%) 
2458945.6%
 
3360535.9%
 
46296.3%
 
(Missing)123012.2%
 
Frequencies of value counts

Unique

Unique0
Unique (%)0.0%
Histogram of lengths of the category

Length

Max length3
Median length3
Mean length3
Min length3

Interactions

[Figures omitted: 100 pairwise interaction plots.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
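The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report generator runs; the function name `pearson_r` and the sample values are made up for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of deviations are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Made-up values: price is an exact linear function of area.
area = [50, 80, 120, 200]
price = [2 * a + 10 for a in area]

r_pos = pearson_r(area, price)                    # perfectly linear, increasing
r_neg = pearson_r(area, [-p for p in price])      # perfectly linear, decreasing
r_shifted = pearson_r([3 * a + 7 for a in area], price)  # rescaled and shifted input
print(r_pos, r_neg, r_shifted)
```

Rescaling and shifting `area` leaves the result unchanged, which is the location and scale invariance mentioned above.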

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
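As a rough sketch of that definition, ρ can be computed by ranking both variables and applying Pearson's formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the data are illustrative assumptions; tied values are given the average of their positions, which is one common convention.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Made-up data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
tie_ranks = average_ranks([10, 20, 20, 30])
print(rho, r, tie_ranks)
```

On this data ρ reaches 1 because the relationship is perfectly monotonic, while r stays below 1 because it is not linear.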

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
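The pair-counting rule above can be sketched directly. This follows the formula exactly as stated (often called tau-a); implementations that adjust for ties can return different values when ties are present. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # product == 0 is a tie in x or y and counts as neither
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Made-up data: one swapped pair among four observations.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]
tau = kendall_tau_a(xs, ys)  # 5 concordant, 1 discordant, 6 pairs in total
print(tau)
```

The double loop over pairs is quadratic in the number of observations, which is fine for illustration but slow for large columns.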

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
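A minimal sketch of a bias-corrected Cramér's V, using the correction as it is commonly stated for Bergsma's estimator. It is not necessarily identical to the report generator's implementation; the function name is hypothetical and the sample rows are made up, reusing the one-to-one mapping between building_state_agg and building_state_median_price seen in this report.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length sequences of category labels."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # a constant variable carries no association
    return sqrt(phi2_corr / denom)


# Made-up rows: each state maps to exactly one median price, as in this report.
states = ["good", "good", "good", "to_renovate", "to_renovate", "renovated"]
price_of = {"good": 320000, "to_renovate": 230000, "renovated": 310000}
prices = [price_of[s] for s in states]

v_perfect = cramers_v_corrected(states, prices)
v_independent = cramers_v_corrected(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(v_perfect, v_independent)
```

A one-to-one mapping between two categorical columns gives a value of 1, which is why such derived columns trigger the high-correlation warnings listed in the overview.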

Missing values

[Figures omitted: 4 missing-value plots.]

Sample

First rows

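(index) | Unnamed: 0 | source | postcode | house_is | property_subtype | price | rooms_number | area | equipped_kitchen_has | furnished | open_fire | terrace | terrace_area | garden | garden_area | land_surface | facades_number | swimming_pool_has | region | building_state_agg | postcode_median_price | building_state_median_price | property_subtype_median_price | building_property_subtype_median_facades | property_subtype_median_facades
0 | 0 | 6 | 4180 | True | MIXED_USE_BUILDING | 295000.0 | 3.0 | 242.0 | True | False | False | True | 36.0 | True | 1000.0 | 1403.0 | NaN | False | W | good | 229000.0 | 320000.0 | 310000.0 | 2.0 | 2.0
1 | 1 | 6 | 8730 | True | VILLA | 675000.0 | 4.0 | 349.0 | True | False | False | False | 0.0 | True | 977.0 | 1526.0 | NaN | False | F | good | 241000.0 | 320000.0 | 540000.0 | 4.0 | 4.0
2 | 2 | 6 | 4020 | True | APARTMENT_BLOCK | 250000.0 | 5.0 | 303.0 | True | False | False | False | 0.0 | False | 0.0 | 760.0 | NaN | False | W | to_renovate | 195000.0 | 230000.0 | 357500.0 | NaN | NaN
3 | 3 | 6 | 1200 | True | HOUSE | 545000.0 | 4.0 | 235.0 | True | True | False | False | 0.0 | False | 0.0 | 63.0 | NaN | False | B | renovated | 445000.0 | 310000.0 | 288000.0 | 3.0 | 3.0
4 | 4 | 6 | 1190 | True | MIXED_USE_BUILDING | 500000.0 | 2.0 | 220.0 | True | False | False | False | 0.0 | True | 60.0 | 193.0 | NaN | False | B | good | 360000.0 | 320000.0 | 310000.0 | 2.0 | 2.0
5 | 5 | 6 | 4040 | True | HOUSE | 189000.0 | 3.0 | 200.0 | True | False | False | False | 0.0 | True | 40.0 | 100.0 | NaN | False | W | to_renovate | 229000.0 | 230000.0 | 288000.0 | 3.0 | 3.0
6 | 6 | 6 | 4540 | True | MIXED_USE_BUILDING | 465000.0 | 4.0 | 400.0 | True | False | False | False | 0.0 | False | 0.0 | 312.0 | NaN | False | W | good | 175000.0 | 320000.0 | 310000.0 | 2.0 | 2.0
7 | 7 | 6 | 1150 | True | APARTMENT_BLOCK | 650000.0 | 4.0 | 200.0 | True | False | False | True | 4.0 | True | 150.0 | 301.0 | NaN | False | B | good | 620000.0 | 320000.0 | 357500.0 | NaN | NaN
8 | 8 | 6 | 6870 | True | MIXED_USE_BUILDING | 89000.0 | 3.0 | 180.0 | True | False | False | False | 0.0 | False | 0.0 | 96.0 | NaN | False | W | to_renovate | 124700.0 | 230000.0 | 310000.0 | 2.0 | 2.0
9 | 9 | 6 | 4030 | True | MIXED_USE_BUILDING | 129000.0 | 3.0 | 156.0 | True | False | False | False | 0.0 | False | 0.0 | 71.0 | NaN | False | W | to_renovate | 190000.0 | 230000.0 | 310000.0 | 2.0 | 2.0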

Last rows

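(index) | Unnamed: 0 | source | postcode | house_is | property_subtype | price | rooms_number | area | equipped_kitchen_has | furnished | open_fire | terrace | terrace_area | garden | garden_area | land_surface | facades_number | swimming_pool_has | region | building_state_agg | postcode_median_price | building_state_median_price | property_subtype_median_price | building_property_subtype_median_facades | property_subtype_median_facades
10043 | 11277 | 4 | 9690 | False | APARTMENT | 315000.0 | 3.0 | 192.0 | True | False | False | True | 48.0 | False | 0.0 | 0.0 | 3.0 | False | F | renovated | 299000.0 | 310000.0 | 282500.0 | 2.0 | 2.0
10044 | 11278 | 4 | 8300 | False | APARTMENT | 490000.0 | 2.0 | 91.0 | True | False | False | False | 0.0 | False | 0.0 | 0.0 | 2.0 | False | F | good | 649000.0 | 320000.0 | 282500.0 | 2.0 | 2.0
10045 | 11279 | 4 | 8800 | False | APARTMENT | 265000.0 | 3.0 | 138.0 | True | False | False | False | 0.0 | False | 0.0 | 0.0 | 2.0 | False | F | good | 240000.0 | 320000.0 | 282500.0 | 2.0 | 2.0
10046 | 11280 | 4 | 6000 | False | APARTMENT | 99000.0 | 2.0 | 91.0 | True | False | False | False | 0.0 | False | 0.0 | 0.0 | 2.0 | False | W | to_renovate | 154500.0 | 230000.0 | 282500.0 | 2.0 | 2.0
10047 | 11281 | 4 | 2950 | False | LOFT | 410000.0 | 3.0 | 150.0 | True | False | False | True | 41.0 | False | 0.0 | 0.0 | 3.0 | False | F | good | 343500.0 | 320000.0 | 422000.0 | 3.0 | 3.0
10048 | 11282 | 4 | 4000 | False | APARTMENT | 245000.0 | 2.0 | 103.0 | False | False | False | True | 5.0 | False | 0.0 | 0.0 | 2.0 | False | W | good | 225000.0 | 320000.0 | 282500.0 | 2.0 | 2.0
10049 | 11283 | 4 | 8790 | False | APARTMENT | 250000.0 | 1.0 | 300.0 | False | False | False | False | 0.0 | False | 0.0 | 0.0 | 2.0 | False | F | good | 257000.0 | 320000.0 | 282500.0 | 2.0 | 2.0
10050 | 11284 | 4 | 2018 | False | APARTMENT | 298000.0 | 1.0 | 71.0 | True | False | False | True | 12.0 | False | 0.0 | 0.0 | 1.0 | False | F | good | 443475.0 | 320000.0 | 282500.0 | 2.0 | 2.0
10051 | 11286 | 4 | 2000 | False | FLAT_STUDIO | 150000.0 | 1.0 | 40.0 | True | False | False | False | 0.0 | False | 0.0 | 0.0 | 2.0 | False | F | to_renovate | 497000.0 | 230000.0 | 149000.0 | 2.0 | 2.0
10052 | 11287 | 4 | 2060 | False | APARTMENT | 228009.0 | 2.0 | 80.0 | True | False | False | False | 0.0 | False | 0.0 | 0.0 | 2.0 | False | F | good | 299000.0 | 320000.0 | 282500.0 | 2.0 | 2.0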